Gangneung
Robust Causal Directionality Inference in Quantum Inference under MNAR Observation and High-Dimensional Noise
In quantum mechanics, observation actively shapes the system, paralleling the statistical notion of Missing Not At Random (MNAR). This study introduces a unified framework for \textbf{robust causal directionality inference} in quantum engineering, determining whether relations are system$\to$observation, observation$\to$system, or bidirectional. The method integrates CVAE-based latent constraints, MNAR-aware selection models, GEE-stabilized regression, penalized empirical likelihood, and Bayesian optimization. It jointly addresses quantum and classical noise while uncovering causal directionality, with theoretical guarantees for double robustness, perturbation stability, and oracle inequalities. Simulation and real-data analyses (TCGA gene expression, proteomics) show that the proposed MNAR-stabilized CVAE+GEE+AIPW+PEL framework achieves lower bias and variance, near-nominal coverage, and superior quantum-specific diagnostics. This establishes robust causal directionality inference as a key methodological advance for reliable quantum engineering.
LLM-as-a-Judge & Reward Model: What They Can and Cannot Do
Son, Guijin, Ko, Hyunwoo, Lee, Hoyoung, Kim, Yewon, Hong, Seunghyeok
LLM-as-a-Judge and reward models are widely used alternatives of multiple-choice questions or human annotators for large language model (LLM) evaluation. Their efficacy shines in evaluating long-form responses, serving a critical role as evaluators of leaderboards and as proxies to align LLMs via reinforcement learning. However, despite their popularity, their effectiveness outside of English remains largely unexplored. In this paper, we conduct a comprehensive analysis on automated evaluators, reporting key findings on their behavior in a non-English environment. First, we discover that English evaluation capabilities significantly influence language-specific capabilities, often more than the language proficiency itself, enabling evaluators trained in English to easily transfer their skills to other languages. Second, we identify critical shortcomings, where LLMs fail to detect and penalize errors, such as factual inaccuracies, cultural misrepresentations, and the presence of unwanted language. Finally, we release Kudge, the first non-English meta-evaluation dataset containing 5,012 human annotations in Korean.
Toward Robust LiDAR based 3D Object Detection via Density-Aware Adaptive Thresholding
Lee, Eunho, Jung, Minwoo, Kim, Ayoung
Robust 3D object detection is a core challenge for autonomous mobile systems in field robotics. To tackle this issue, many researchers have demonstrated improvements in 3D object detection performance in datasets. However, real-world urban scenarios with unstructured and dynamic situations can still lead to numerous false positives, posing a challenge for robust 3D object detection models. This paper presents a post-processing algorithm that dynamically adjusts object detection thresholds based on the distance from the ego-vehicle. 3D object detection models usually perform well in detecting nearby objects but may exhibit suboptimal performance for distant ones. While conventional perception algorithms typically employ a single threshold in post-processing, the proposed algorithm addresses this issue by employing adaptive thresholds based on the distance from the ego-vehicle, minimizing false negatives and reducing false positives in urban scenarios. The results show performance enhancements in 3D object detection models across a range of scenarios, not only in dynamic urban road conditions but also in scenarios involving adverse weather conditions.
Demystifying CLIP Data
Xu, Hu, Xie, Saining, Tan, Xiaoqing Ellen, Huang, Po-Yao, Howes, Russell, Sharma, Vasu, Li, Shang-Wen, Ghosh, Gargi, Zettlemoyer, Luke, Feichtenhofer, Christoph
Contrastive Language-Image Pre-training (CLIP) is an approach that has advanced research and applications in computer vision, fueling modern recognition systems and generative models. We believe that the main ingredient to the success of CLIP is its data and not the model architecture or pre-training objective. However, CLIP only provides very limited information about its data and how it has been collected, leading to works that aim to reproduce CLIP's data by filtering with its model parameters. In this work, we intend to reveal CLIP's data curation approach and in our pursuit of making it open to the community introduce Metadata-Curated Language-Image Pre-training (MetaCLIP). MetaCLIP takes a raw data pool and metadata (derived from CLIP's concepts) and yields a balanced subset over the metadata distribution. Our experimental study rigorously isolates the model and training settings, concentrating solely on data. MetaCLIP applied to CommonCrawl with 400M image-text data pairs outperforms CLIP's data on multiple standard benchmarks. In zero-shot ImageNet classification, MetaCLIP achieves 70.8% accuracy, surpassing CLIP's 68.3% on ViT-B models. Scaling to 1B data, while maintaining the same training budget, attains 72.4%. Our observations hold across various model sizes, exemplified by ViT-H achieving 80.5%, without any bells-and-whistles. Curation code and training data distribution on metadata is made available at https://github.com/facebookresearch/MetaCLIP.
Facial Recognition Will be Used at Tokyo Olympics to Improve Security - Latest Hacking News
In the year 2020, Tokyo will essentially be breaking a tech record when they become the very first Olympics to use facial recognition technology in an attempt to improve security. On Tuesday, the organizing committee announced that Tokyo will be utilizing the tech for indentifying officials, athletes, media, and staff at the 2020 Paralympics and Olympics Games. It will, however, not be utilized in the identification of any spectators who attend the events. The utilization of the facial recognition technology is an effort to both increase security and hasten the entrance of authorized individuals. The Japanese IT conglomerate, NEC Corp, is currently developing the system.
Machine Learning in High Energy Physics Community White Paper
Albertsson, Kim, Altoe, Piero, Anderson, Dustin, Andrews, Michael, Espinosa, Juan Pedro Araque, Aurisano, Adam, Basara, Laurent, Bevan, Adrian, Bhimji, Wahid, Bonacorsi, Daniele, Calafiura, Paolo, Campanelli, Mario, Capps, Louis, Carminati, Federico, Carrazza, Stefano, Childers, Taylor, Coniavitis, Elias, Cranmer, Kyle, David, Claire, Davis, Douglas, Duarte, Javier, Erdmann, Martin, Eschle, Jonas, Farbin, Amir, Feickert, Matthew, Castro, Nuno Filipe, Fitzpatrick, Conor, Floris, Michele, Forti, Alessandra, Garra-Tico, Jordi, Gemmler, Jochen, Girone, Maria, Glaysher, Paul, Gleyzer, Sergei, Gligorov, Vladimir, Golling, Tobias, Graw, Jonas, Gray, Lindsey, Greenwood, Dick, Hacker, Thomas, Harvey, John, Hegner, Benedikt, Heinrich, Lukas, Hooberman, Ben, Junggeburth, Johannes, Kagan, Michael, Kane, Meghan, Kanishchev, Konstantin, Karpiลski, Przemysลaw, Kassabov, Zahari, Kaul, Gaurav, Kcira, Dorian, Keck, Thomas, Klimentov, Alexei, Kowalkowski, Jim, Kreczko, Luke, Kurepin, Alexander, Kutschke, Rob, Kuznetsov, Valentin, Kรถhler, Nicolas, Lakomov, Igor, Lannon, Kevin, Lassnig, Mario, Limosani, Antonio, Louppe, Gilles, Mangu, Aashrita, Mato, Pere, Meenakshi, Narain, Meinhard, Helge, Menasce, Dario, Moneta, Lorenzo, Moortgat, Seth, Neubauer, Mark, Newman, Harvey, Pabst, Hans, Paganini, Michela, Paulini, Manfred, Perdue, Gabriel, Perez, Uzziel, Picazio, Attilio, Pivarski, Jim, Prosper, Harrison, Psihas, Fernanda, Radovic, Alexander, Reece, Ryan, Rinkevicius, Aurelius, Rodrigues, Eduardo, Rorie, Jamal, Rousseau, David, Sauers, Aaron, Schramm, Steven, Schwartzman, Ariel, Severini, Horst, Seyfert, Paul, Siroky, Filip, Skazytkin, Konstantin, Sokoloff, Mike, Stewart, Graeme, Stienen, Bob, Stockdale, Ian, Strong, Giles, Thais, Savannah, Tomko, Karen, Upfal, Eli, Usai, Emanuele, Ustyuzhanin, Andrey, Vala, Martin, Vallecorsa, Sofia, Verzetti, Mauro, Vilasรญs-Cardona, Xavier, Vlimant, Jean-Roch, Vukotic, Ilija, Wang, Sean-Jiun, Watts, Gordon, Williams, Michael, Wu, Wenjing, Wunsch, Stefan, Zapata, Omar
The main objectives of particle physics in the post-Higgs boson discovery era is to exploit the full physics potential of both the Large Hadron Collider (LHC) and its upgrade, the high luminosity LHC (HL-LHC), in addition to present and future neutrino experiments. The HL-LHC will deliver data from 100 times the luminosity compared to the LHC, bringing quantitatively and qualitatively new challenges due to event size, data volume, and complexity. The physics reach of the experiments will be limited by the physics performance of algorithms and computational resources. Machine learning (ML) applied to particle physics promises to provide improvements in both of these areas. Incorporating machine learning in particle physics workflows will require significant research and development over the next five years. Areas where significant improvements are needed include: - Physics performance of reconstruction and analysis algorithms; - Execution time of computationally expensive parts of event simulation, pattern recognition, and calibration; - Realtime implementation of machine learning algorithms; - Reduction of the data footprint with data compression, placement and access.
DJI will create no-fly zones around Olympic venues in South Korea
Days ago, South Korean authorities announced that they'd capture any drone that got too close to Olympics event facilities. If you have a DJI-made craft, you won't even be able to get close. The UAV maker is releasing a software patch that creates a no-fly zone around Olympic areas. For the duration of the games, DJI drones won't be able to fly through areas in the South Korean cities of Pyeongchang, Gangneung, Bongpyeong and Jeongseon. "Safety is DJI's top priority and we've always taken proactive steps to educate our customers to operate within the law and where appropriate, implement temporary no-fly zones during major events," the company said in a statement, according to TechCrunch.
Robots Ready to Ski, Paint, and Clean at South Korea's 2018 Winter Olympics
Spectators who get lost in Olympic Plaza in Pyeongchang next month can ask for directions from a nearby guide that speaks four languages. Thirsty patrons who visit Gangneung Media Village can order drinks for delivery. And they can do all of this without talking to another human. South Korea is going big on robotics for the 2018 Winter Olympics, which begin on 9 February. Organizers will deploy about 80 robots at the games to showcase the nation's leader ship in advanced robotics research.